Retrieving Best Entry Points in Semi-Structured Documents

Authors

  • Eugen Popovici
  • Pierre-François Marteau
  • Gildas Ménier
Abstract

Focused structured document retrieval makes use of the concept of the best entry point (BEP), which is intended to define, from a user’s perspective, the point from which browsing of relevant document components should optimally start [5]. In this paper we describe a simple, efficient and effective method for providing BEP candidates in XML documents. In experiments conducted within the INEX 2006 evaluation campaign, the proposed approach ranked first out of 77 official submissions for the Best in Context task. We also compare the effectiveness of the approach with a standard 'flat' document retrieval system that returns document snippets as BEPs. The experimental results on the Wikipedia collection [3] show that users consider BEPs a useful feature.

Similar resources

Best entry points for structured document retrieval - Part II: Types, usage and effectiveness

Structured document retrieval makes use of document components as the basis of the retrieval process, rather than complete documents. The inherent relationships between these components make it vital to support users’ natural browsing behaviour in order to offer effective and efficient access to structured documents. This paper examines the concept of best entry points, which are document compo...

Focussed Structured Document Retrieval

Focussed structured document retrieval aims at retrieving best entry points from where users can browse to access relevant document components in the document structure. In this paper, we report on the development, implementation and evaluation of best entry point retrieval strategies derived from user studies designed to elicit what constitutes a best entry point.

Best entry points for structured document retrieval - Part I: Characteristics

Structured document retrieval makes use of document components as the basis of the retrieval process, rather than complete documents. The inherent relationships between these components make it vital to support users’ natural browsing behaviour in order to offer effective and efficient access to structured documents. This paper examines the concept of best entry points, which are document compo...

A Model for the Representation and Focussed Retrieval of Structured Documents Based on Fuzzy Aggregation

Effective retrieval of structured documents should exploit the content and structural knowledge associated with the documents. This knowledge can be used to focus retrieval to the best entry points: document components that contain relevant information, and from which users can browse to retrieve further relevant components. To enable this, suitable representation methods must be developed. Thi...

Construction of a Test Collection for the Focussed Retrieval of Structured Documents

In this paper, we examine the methodological issues involved in constructing test collections of structured documents and obtaining best entry points for the evaluation of the focussed retrieval of document components. We describe a pilot test of the proposed test collection construction methodology performed on a document collection of Shakespeare plays. In our analysis, we examine the effect ...

Publication date: 2007